ABSTRACT

Although grades have been criticized for lack of 
reliability, end-of-course grades and grade-point averages are
reliable enough for most uses. The charge of unreliability applies 
only to grades on theses, tests, or other individual pieces of 
student work. On the other side of the controversy, grades have been 
said to be essential to the learning process because they provide for 
the evaluation of student performance. But performance is evaluated 
and its results reported to students independently of any grading 
system. The justification for grades must lie elsewhere. The critical 
issue in grading is the validity and usefulness of grades for the 
variety of purposes they are called on to serve — conveying 
information on student achievement, providing incentives for students 
to study, serving as selection criteria, providing material for 
administrative records, helping in the evaluation and monitoring of 
the instructional process, and assisting students in educational and 
occupational planning. Until better information is available on the 
effectiveness of grades with respect to these various functions, the 
continued trading of unsupported assertions about them will be 
fruitless. New approaches to grading, such as contract and
criterion-referenced grading, do not change the basic issues.
(Author/RC)

The questions raised about grading practices are apt to include
the following:

What should grades represent? 

Should student effort be reflected in grades? 
Are grades adequately reliable? 

How can their reliability and fairness be improved?
What set of symbols should be used?

Questions similar to these can be found in many educational
books and journals of the 1970s. Those listed above were taken from
a monograph published more than 60 years ago (9). Answers to the
questions and disagreement about the answers have also changed
little during the intervening years. Finkelstein (9) and Ebel (8)
agree that grades should be based only on student accomplishment.
On the other hand, grades are used to reflect pupil growth,
aptitude, effort, and attitudes as well as achievement (25). They
should reflect achievement in relation to ability, according to
Kindsvatter (13); they should not, according to Terwilliger (23),
Ebel (8), and others. Still others would permit grades to represent
both effort and achievement and perhaps other qualities, but
independently of one another (for example, 10), while Ebel (8)
points out problems that the use of multiple criteria adds to the
already complicated process of providing useful grades.

More than 60 years ago, Kelly (12) reviewed studies available 
at that time of the variability of grade distributions across 
colleges and universities, across fields within institutions and 
across faculty members within fields and concluded that "a given 
grade or mark means many widely different things to different
teachers." The same concern is an issue today (2). Clear
specification of the grading criteria - whatever they may be - and
the meaning of the points on the particular grade scale used are
stated as major requirements for the improvement of grading (3, 7,
8, and 10).

The continued vehemence of the controversy over the grading 
process without apparent progress after more than half a century is 
puzzling. Although grading is in some respects a complex, shifting
issue, difficult to come to grips with, about 65 years of study and
debate would ordinarily be expected to clarify or redefine the issue
at least, if not produce some clear advances in understanding and
practice. Hiner (11) accounts for the lack of progress by ascribing
to the grading process the characteristics of a cultural ritual that
serves to reduce the impact of a fundamental social problem. The
problem eased by the grading ritual, according to Hiner, is the
distribution of rewards in a society that simultaneously values
achievement and equality. Grades, in acting as a buffer between two
conflicting values, will necessarily be a source of conflict until
the competing cultural values are changed. Some such culturally
based interpretation as Hiner's seems necessary in view of the
persistence and ardor of the grading controversy and the frequent
absence from the arguments of anything more than assertions of
belief.

Those in favor of drastically modifying or even abolishing 
current practices claim that grading interferes with learning by 
creating anxiety in students as well as a distaste for school, by 
forcing them to spend their time and energy on grades rather than 
learning, and by making teachers and students antagonists instead of 
allies in the learning process. Those who defend grading say it is 
essential to learning because students would not work as hard in the
absence of grades and because effective learning requires the
knowledge of results that grades provide. The pro-grade faction
argues further that grades are necessary in the selection and
placement of students in successively higher levels of education.
Both positions are based on the asserted effects of grades on
learning, but the common ground on which they can be compared is
limited. While the two factions agree, for example, that grading
increases the level of competition in learning, the pro-grade
faction approves of competition and the anti-grade faction
disapproves of it. Neither faction has pursued the implications of
competition far enough to be forced to resolve their apparently
comfortable conflict.



Five years ago, a review of the growing literature on grading 
(24) showed the situation to be much the same as it is today.
Despite a continuing flood of reports on experimental grading
procedures and frequent broadsides from both camps, only one advance
in understanding has been made in the five years since the earlier
review. The evidence is now overwhelming that pass-fail grading
does not induce students to take courses they would have avoided
under a traditional grading system for fear of depressing their
grade-point average. Proponents of pass-fail grading have therefore
lost one of their major positions, but one that seems not to be
critical in the broader controversy.

Despite continuing moderate interest in pass-fail grading, on 
the college level it is rarely allowed to apply to any more than a 
minimal part of a student's program. It is less common still in 
high school (19). In elementary school, where it more commonly 
appears as S-U (satisfactory - unsatisfactory), its impact is 
modified by the use of several rating scales, which gives the 
appearance of providing enough information to satisfy the users of 
grades. The major impact of pass-fail grading in college seems to 
be its contribution, to an unknown degree, to the inflation of
undergraduate college grades. 






Grade Inflation

Since the late 1960s, college grade-point averages have risen by
about half a letter grade (20); the most common college grade is now
a B rather than a C. Possible contributing factors other than an
increase in pass-fail grading, which may have reduced the number of
Cs and Ds proportionately more than the As and Bs, include the
refusal of faculty members to give low grades to college men when
satisfactory grades could keep them from being drafted; a growing
dissatisfaction with the grading process as a whole; increasing
frequency of field experience, internships, and other nontraditional
kinds of courses in which grades below a B seem to be rare; and, to
a small extent, the recently growing practice of nonpunitive
grading, the abandonment of Fs in favor of simply not reporting a
grade. While each of these may have contributed to the rise in
grade-point averages over the past decade, the proportion of As and
Bs relative to all other grades, including pass, has grown
substantially. Pass-fail and nonpunitive grading cannot have
contributed to that increase.

The growth of nonpunitive grading (either pass-no report or
ABCX, with X indicating that no grade was being reported) followed
pass-fail grading by a few years. Arguments in its favor seemed
overwhelming in comparison with arguments against it. The primary
opposition was based on the fear that permitting students to take a
course and fail it, without keeping a record of their failure, would
lead to an overburdening of college resources by poorly prepared or
frivolous students who would crowd classes in search of a few in
which they might succeed, denying places to better prepared or more
serious students. Yet there are simple procedures, such as
requiring students to maintain a minimal rate of course completion
for continued enrollment, that would prevent abuse of a no-fail
grading system. Other reasons for opposition have been the desire
to prevent the range of the grade-point average from shrinking by
maintaining the bottom grade and the belief of some faculty members
that failure is a phenomenon from which students should not be
sheltered. None of these objections to the removal of failure from
the grading process is compelling, yet recent shifts to nonpunitive
grading are being reversed.

While college grades have clearly risen since the mid or late 
1960s, the recent furor over grade inflation seems to be reversing 
that trend - a trend that never was universal (14). The desire of
graduate and professional schools to restore greater spread to the
undergraduate grade-point average for greater ease of selection and
the desire of faculty members and some students to maintain a system
of public recognition of superior performance are probably the
primary reasons behind current efforts to roll back the recent rise
in college grades.




Purposes of Grades 

Basic to any consideration of the good or evil that results from the 
use of grades is an understanding of their purposes and of the ways 
the grading process accomplishes them. There is general agreement
on the purposes themselves, but misunderstanding has persisted over
how they are achieved. 

At precollege levels of education, one of the primary purposes 
of grades is to report to parents on the school performance of their 
children. Many writers consider this the most important function of 
grades at precollege levels, but at the college level that purpose 
almost disappears. At the elementary level, the most common form of
grade is the dichotomous Satisfactory-Unsatisfactory applied to a
varied group of qualities that may include knowledge, growth,
potential achievement, effort, attitudes, conduct, attendance,
citizenship, or character (8, 25, and 26). The use of checklists
showing each of the qualities to be assessed and reported is fairly
common in elementary schools but less so at the secondary level (1
and 19). In college it is quite rare. With each advancing
educational level, then, the grading process shifts from multiple
criteria, each graded S or U, to a single criterion - academic
achievement - graded at several levels. Other considerations, such
as effort, attitude, or interest, enter the process to a degree 
known only to the individual teachers assigning the grades. With 
this shift in the form of grading used at successively higher 
educational levels, the amount of information conveyed by grades in 
a formal sense is reduced, but the amount needed has dropped as 
well. At the college level grades are rarely used to provide 
information to parents, and the students already have most of the 
information in grades. The student who does not know in advance 
with reasonable accuracy what his grades will be is rare. 

The second commonly cited purpose of grades is to induce
students to study. The nature of the inducement and its effect,
however, are complicated. Achievement orientation, age, and the
degree of competitiveness in the testing situation can be expected
to interact in their effects on student performance, reports
McKeachie (15). Other studies he reviewed showed that the effects
of teachers' evaluative communications to students depended on
whether they were primarily informational or contained an element of
praise.

A recent study of high school students showed the threat of low 
grades to be a greater inducement to study than the appeal of high 
grades, and both to be more effective than no grades at all (5). 
These results raise new questions about the move in some colleges 
and universities to abolish failing grades. But the authors warn 
that a large number of complicating factors probably affect the role 
of grades as incentives.

At the elementary level grades probably serve pupils almost
entirely as indicators of adult approval and therefore of personal
worth. As the educational level advances, grades take on additional 
meaning as indicators to students that they have accomplished 
something of value that is independent of adult approval. The 
importance of grades to parents also increases with each grade 
level, which affects their importance to students in complicated 
ways. Furthermore, the higher the educational level, the more 
important grades become as a valued commodity in their own right, as 
some high school students become concerned about admission to 
college and college students begin planning for graduate or 
professional school. The motivating function of grades therefore 
varies, depending on the student's educational level, his or her
intentions and inclinations, family relationships, personality, and
other characteristics of student and school.

At the college level, and to some extent at the secondary 
level, grades are used for selection and placement. High school 
grades determine selection by particular colleges and universities 
and college grades determine selection by graduate and professional 
schools. In high school and junior high school as well as college, 
grades help determine whether students are placed in select, 
regular, or remedial classes. As students progress, this function
of grades becomes more critical. In secondary schools, placement
may rest primarily on the judgment of individual teachers, with 
grades acting only as prompters. Faculty judgment operates somewhat 
less strongly in placement in college classes but still has some 
effect in admission to college and to graduate and professional 
schools through faculty recommendations. In these selection and 
placement functions, grades operate as summary statements of a 
number of teacher judgments. 

The function of grades as concise indicators of teacher 
judgments gives them an important role as the primary criterion 
against which selection and placement decisions are validated. Both 
types of decision are intended to assure that students will be
engaged in programs at a level at which they can succeed. Whether
based on prior grades, teacher recommendations, test scores, prior
school experience, or other considerations, the success of selection
and placement decisions is usually determined through reference to
grades achieved. Other criteria could be used and for many purposes
would be preferable to grades, as when the purpose of selection is
to assure the presence in a program of students who will benefit 
most, in some specified sense, from the particular program offered. 
But grades are familiar, convenient, and almost universally accepted 
as the dominant criterion for successful selection and placement. 
Standardized tests of academic aptitude and achievement depend 
almost exclusively for their acceptance on the degree to which they 
are related to grades.

Providing information to parents, students, and others about 
the level of student achievement, providing a basis for admission to
more advanced educational programs, and serving as inducements to
study are the three most commonly agreed on purposes of grades.
Others include helping students choose educational and occupational
goals, monitoring the effectiveness of instructional programs, and
providing information for administrative records for purposes of
promotion, awards, probation, dismissal, and other administrative
decisions.

Points of Confusion in Grading 

The reliability of grades, which Finkelstein and others (9 and 23)
questioned more than 60 years ago, is still a source of confusion 
because of uncertainty about which point in the grading process is 
at issue. At an early point in the process, individual instances of 
student performance — tests, quizzes, written reports, essays or 
themes — are evaluated and graded. Later, the teacher combines these 
grades with impressions he has derived from informal observations of 
the student during the course of a school term to arrive at an 
overall grade for the course. While there is still disagreement as 
to whether qualities like students' attitudes, interest in the 
course, or effort should enter into course grades, the weight of 
opinion favors limiting course grades to reporting academic
achievement. Nevertheless, faculty members, even in advanced
college courses, are often reluctant to give the same grade to two
students equal in achievement, one a very bright student who coasted
without effort and with little interest through the course and the
other a less bright but conscientious student who worked hard for 
what he or she learned in the course. Finally, course grades are 
combined into grade-point averages for use in selection by colleges 
and universities, graduate and professional schools, honorary 
societies and scholarship and fellowship awards committees. Despite 
substantial differences in the characteristics of grading for each 
of these, the distinctions are often ignored in discussions of
grades. 

One of the important differences among grades on particular
student products, course grades, and grade-point averages is their
reliability, an issue raised by Finkelstein (9). The judgments made 
by teachers in assigning grades to individual pieces of work are 
frequently unreliable in the sense that the same teacher may not 
assign the same grade to the same piece of work if judged at
different times, and different teachers often would assign different 
grades to the same product. This is usually the part of the grading 
process people refer to when they cite the unreliability of grades. 
But overall course grades are reliable or consistent enough across
faculty members and in the course of reasonable periods of time to 
be useful measures of achievement. And grade-point averages are 
more reliable still (2 and 4). They are generally reliable enough
to justify their use in admission decisions about individual
students; and many of the alternative selection criteria, such as
recommendations or judgments based on interviews, are far less
reliable. Course grades are also reasonably reliable indicators of
comparative levels of student achievement. Even the grades for
individual pieces of work can be made reliable through clear
definition of the qualities to be judged and careful observation and
assessment of the evidence on which the judgments are to be based.
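
The progression described here, from unreliable single grades to more
reliable course grades and still more reliable grade-point averages,
is what one would expect whenever several imperfect judgments are
combined. One standard way to illustrate it is the Spearman-Brown
formula, which projects the reliability of a composite of k
comparable measures from the reliability of a single one. The
formula is not named in the sources cited here, and the starting
value of 0.40 and the counts of 8 and 32 below are hypothetical
figures chosen only for illustration, not data from any study.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Projected reliability of a composite of k parallel measures,
    each with reliability r_single (Spearman-Brown formula)."""
    if not 0.0 <= r_single <= 1.0:
        raise ValueError("r_single must lie between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


# Hypothetical reliability of one grade on one piece of work.
single_grade = 0.40

# k = 1: a single grade; k = 8: a course grade built from eight
# graded pieces; k = 32: a composite spanning 32 such pieces.
for k in (1, 8, 32):
    print(k, round(spearman_brown(single_grade, k), 2))
```

The formula assumes the combined judgments are of comparable quality
and measure the same thing, which real grades satisfy only roughly,
so the projected values should be read as a tendency rather than as
estimates for actual grades.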

Validity 

The issue on which grading, as it is typically carried out, is 
vulnerable to criticism is not reliability but validity. As 
predictors of future grades, grades are about as valid as anything
else. But the prediction of future grades is a limited basis for 
deciding on the validity or usefulness of grades. The variety of 
purposes grades are intended to serve implies numerous other 
validities for grades. A procedure with relatively low reliability 
that is nonetheless valid for a particular purpose is preferable to 
a more reliable measure that is unrelated to the behavior of
interest. The same process is unlikely to serve all purposes
equally well.

The different uses and validities of grades are the aspect of 
grading most in need of study and most subject to unsupported 
assertions that grades are either good or bad. Studies of grading 
processes have provided evidence on the validity of grades for 
particular purposes, usually admission, but they are too narrow in 
their focus to provide much knowledge about grading as a major 
educational institution, or, in Hiner's (11) view, as a cultural 
ritual. The value of grades in all their functions should be
studied if the controversy is to continue. 

Evaluation and Grading 

Another source of confusion in considerations of grading is the 
distinction between evaluating a student's performance and reporting 
the result of that evaluation in the form of a grade. A familiar 
evaluation process is the one in which a teacher reads a student's 
paper and judges her perception in bringing certain arguments to 
bear, her grasp of the underlying issues, the coherence of her
presentation, the soundness of the conclusions she reaches, and the
elegance and correctness of her prose. The teacher can evaluate
each of these aspects of the paper and write extensive comments to 
the student without any concern for grading. 

Because grades are typically required at the end of a course, 
after the evaluation process just described, the teacher will
probably estimate how the paper compares with those of other
students in comparable courses and assign a grade based on that
comparison. But evaluating the student's work and discussing it
with her - the parts of the grading process that are so important to 
learning - can take place without any concern for grading at all. 
This confusion between evaluation and grading is the source of many
vehement assertions in the literature that grading is essential to
student learning. Evaluation is essential; grading is not.

Maintaining the distinction between evaluation and grading also
clarifies discussions of the effect of grades on motivation.
Grading itself clearly has consequences for students' motivations,
as when the fear of receiving less than an A induces a premedical
student to study for a biology test instead of immersing himself in
Dostoyevsky's Notes from Underground. But the motivational
consequences for students of knowing that their work will be 
critically evaluated are as clearly of a different sort and have 
little if anything to do with grading. 

Criterion-referenced and Contract Grading 

Although criterion-referenced and contract grading are not the same, 
they are closely related. Furthermore, they are both relatively new 
considerations in discussions of grading. 

In criterion-referenced grading, the performance expected - 
both kind and level - is specified in a way that permits its 
accomplishment by each pupil to be observed independently of the
performance of others (17). Thus "Adds three single-digit numbers"
is a kind of performance that can be observed in an individual pupil
without reference to other pupils. That performance plus others in 
arithmetic and in other subjects can be listed and checked off as 
students learn to do them, providing a comprehensive record of each 
pupil's achievement. Used as a report card, such a checklist would 
be similar to those advocated by Wrinkle (26) and in use for about 
10 percent of elementary pupils (1). The element that is new and 
distinctive is the specification of the criterion of performance 
accomplished instead of the judgment that performance is
satisfactory. 

Contract grading occurs when the student and faculty member, 
usually in college but occasionally in high school, agree at the 
beginning of a course on the amount and quality of work the student 
must do to earn a given grade (6, 16, and 18). The contract may be 
written so that its completion can be determined unambiguously; for
example, set up and carry out an analytic procedure using specified
processes to identify the chemical impurities and their
concentrations in a sample of river water. When a student has met
such specified performance criteria, he or she has completed the
course and will receive the agreed-upon grade. Contract and
criterion-referenced grading can thus be readily combined. Contract
grading does not necessarily require the statement of precise
criteria, but problems are avoided if it does. In any case, the
procedures for evaluating the student's work and the quality
required, if the contract is to be satisfied, should both be
specified.

When used sensibly, both criterion-referenced and contract
grading can be effective devices for indicating student achievement,
but they are not always easy to use. Even at the elementary level,
where academic requirements can often be clearly and simply stated,
as in "Adds three single-digit numbers," problems in assessment and
reporting are not totally avoided. How many times must
second-graders, for example, add three single-digit numbers without
error to be considered proficient? How long must they maintain
their proficiency? Some ninth-graders have learned to multiply
simple fractions successfully every term since the sixth grade, yet
start a new class in arithmetic having to learn it once more. A
teacher's judgment of proficiency may reflect subtle yet important
aspects of performance that specifically stated criteria may miss. 
While the criteria can be worded to accommodate questions of 
persistence in learning and other complexities, they cannot 
reasonably be elaborated indefinitely. 

The value of both criterion-referenced and contract grading is 
in their requirement that educational objectives be carefully 
examined and clearly defined, and that procedures for assessment be 
carefully worked out in advance. In contract grading, the 
participation of the student in establishing the evaluation 
procedure as well as the course objectives minimizes the common 
student complaint that the tests in a course were unfair. 

The Present Status of the Grading Controversy 

Each of the questions raised long ago by Finkelstein is still alive, 
although whether grades are reliable or not no longer deserves to 
be. Many grades on individual pieces of student work are quite
unreliable; course grades are usually reliable enough for many 
purposes; grade-point averages are quite reliable. What grades 
should represent is a more complex issue than whether they should 
reflect effort or attitude. Disagreement still exists on effort and 
attitude, although consensus is against them. Consideration of the 
kinds of performance grades should represent cannot be separated
from consideration of the purposes for which the grades will be
used, an issue that complicates questions of validity and bears on
Finkelstein's concern for fairness. What set of symbols should be
used may be a more lively issue today than in Finkelstein's time.
It is usually concerned today, as then, with the number of points to 
be included on the grading scale, although instances still occur in 
which nothing is changed but the label - High Honors, Honors, Low 
Honors, Pass, and Fail as a replacement for ABCDF. 

Studies of grades and surveys of grading practices are numerous 
and pointlessly repetitious. The most promising way to get off the
grading merry-go-round is to focus studies of grades on their 
several purposes and on alternative ways of accomplishing them. If 
the questions to be pursued are selected and phrased to be critical 
to both sides of the controversy rather than simply confirming a 
point that is not in serious dispute, some advance may be made. But 
if grades do indeed serve unacknowledged functions, such as easing 
social conflict or giving insecure or punitive faculty members a 
device for controlling students, even sound, well-executed studies 
will not resolve the controversy.